Matching test (MT) construction techniques were compared. These
included: (1) MT instructions that limit the number of times
responses can be selected to once, rather than more than once;
(2) MTs that are organized in groups of five premises, or stems,
rather than in groups of ten premises; and (3) MTs designed to
measure knowledge, rather than a higher-order achievement level such
as comprehension or synthesis. The subjects were 196 undergraduate
students enrolled in an introductory educational psychology course
offered at the University of Pittsburgh. Significantly higher scores
were observed for MTs organized in groups of five premises and for
MTs designed to measure higher-order achievement levels. A
recommendation is made to organize MTs in groups of five premises.
The absence of significant interactions suggests that this
recommendation may apply for MTs designed to measure either knowledge
or some higher-order achievement. Also, a special advantage of MTs in
the assessment of partial knowledge is discussed. (Author/EC)
THE CONSTRUCTION OF MATCHING TESTS:
AN EMPIRICAL STATEMENT

GREGORY A. SHANNON
Pennsylvania Department of Education



A matching test (MT) is a form of a multiple-choice test for
which the premises, or stems, and the response options are each
typically arranged in separate columns. The testees are asked to
match each premise with the appropriate response or responses.
An MT is usually broken down into groups of premises and responses
which are called matching exercises (MEs). Most discussions of
MT construction were included in publications for which empirical
support was not apparent (Odell, 1928, Lang, 1930, Ebel, 1951,
and Wesman, 1971).

Constructional aspects of major concern include MT instructions,
ME length, and cognitive achievement levels measurable by MTs.
As opposed to restricting the number of times a given response
option may be selected, Wesman (1971) felt that non-restricted MT
instructions would yield the best type of MT scores. What Wesman
meant by best MTs seemed to be MTs which would be more difficult
to complete, and thus, less susceptible to testee traits such as
guessing and risk-taking abilities.

Suggested ME lengths have ranged from five premises (Ebel,
1951) to thirty or more (Good, 1927). ME length was defined by
the number of premises assigned to the ME. Thus, an ME which
consisted of five premises was defined as a five-premise ME.
In general, it was agreed that MTs would be more difficult to
complete as ME length is increased.

This paper is a summary of the author's Master's thesis at the
University of Pittsburgh.

I would like to thank Drs. Henry Hausdorff, Richard Cox, Robert
Guthrie, Stanley Jacobs, and Charles Stegman for serving on my
committee.

Although MTs may be constructed to measure knowledge (Wesman,
1971) and higher-order achievement levels (Fay, 1929, Ebel, 1951,
and Secord, 1952), the effect of measuring lower-order and
higher-order cognitive achievement levels (Bloom, 1956) upon MT
scores remains to be examined. It does seem reasonable that MTs
constructed to measure higher-order achievement levels would be
more difficult than MTs constructed to measure knowledge.

It has been the author's experience that MTs are commonly
being constructed and used by classroom teachers to assess
achievement. Since guidelines employed by teachers in constructing
MTs are very much in need of research fortification, it is the
purpose of this study to empirically strengthen this support. The
MT construction techniques investigated in this study included:
(a) the use of MT instructions that ask the testee to select a
given response option only once, as opposed to MT instructions
that permit the testee to select a given response option once or
more, (b) constructing MTs that consisted of five-premise MEs, as
opposed to ten-premise MEs, and (c) constructing MTs as measures
of knowledge, as opposed to constructing MTs as measures of
higher-order achievement levels such as comprehension or synthesis.
It was hypothesized that higher MT scores would be observed when
the MT instructions restrict response selection to once, when the
MTs consist of five-premise MEs, and when the MTs are constructed
as measures of knowledge. It was also hypothesized that no
interactions would be observed between the constructional
variables under investigation.

METHOD 

Materials

A premise-response (P-R) pool was constructed to conform to
the final instructional phase of an introductory educational
psychology course at the University of Pittsburgh. The resources
consisted of the course texts and an existing pool of
multiple-choice test items. Premises were constructed and paired
with correct responses. The initial pool of P-R pairs was
reviewed by the course instructor for content validation purposes
and revised. Examinations were also made to insure that each
premise listed had been assigned a unique response. For the
purpose of classifying the P-R pairs into achievement levels, three
graduate students who had completed introductory educational
psychology courses served as judges. They were each presented
with the revised P-R pool along with written instructions which
asked them to select the P-R pairs which they judged to be
measures of knowledge as defined by Krathwohl and Payne (1971).
P-R pairs that were unanimously judged to be measures of knowledge
were classified accordingly and P-R pairs unanimously judged to
be measures of some other cognitive ability were classified as
measures of some higher-order achievement level. It was assumed
that knowledge occupied the lowest cognitive level of the
Taxonomy defined by Bloom and that the hierarchy was exhaustive.
P-R pairs which only two of the judges classified were reviewed
by a professor within the Department of Educational Psychology
and either classified or eliminated.
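
The judging rule described above can be sketched in a few lines. This is only an illustrative reading of the procedure, not the author's own tooling: it assumes each judge gives a yes/no decision on whether a P-R pair measures knowledge, and it treats any non-unanimous vote as a referral to the reviewing professor. The function name and the label strings are hypothetical.

```python
from typing import Sequence

def classify_pair(judged_as_knowledge: Sequence[bool]) -> str:
    """Classify one P-R pair from the judges' yes/no decisions.

    Assumption: a True entry means that judge selected the pair as a
    measure of knowledge; a False entry means the judge did not.
    """
    if all(judged_as_knowledge):
        # Unanimously judged to measure knowledge
        return "knowledge"
    if not any(judged_as_knowledge):
        # Unanimously judged to measure some other cognitive ability
        return "higher-order"
    # Split decision: the source sends these to a professor, who
    # either classifies or eliminates the pair
    return "refer to professor"

print(classify_pair([True, True, True]))     # knowledge
print(classify_pair([False, False, False]))  # higher-order
print(classify_pair([True, True, False]))    # refer to professor
```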

The remaining P-R pairs were constructed into MTs and
reviewed by the course instructor and two graduate assistants to
insure that each response originally paired with a given premise
would remain the most reasonable choice. A final revision was
made. All MTs consisted of thirty premises and thirty-six
responses.

The MT instructions consisted of two forms: restricted and
unrestricted. Restricted instructions were stated as follows:

Complete each statement (numbered) with the most correct
response (lettered). Enter each response in the blank
preceding each statement. Do not select any given
response more than once. Please do not guess.

The unrestricted instructions were stated as follows:

Complete each statement (numbered) with the most correct
response (lettered). Enter all responses in the blank
preceding each statement. Any response may be
selected more than once. Please do not guess.

The two ME lengths were five-premise MEs (with six responses)
and ten-premise MEs (with twelve responses). Each MT form
consisted of thirty premises broken down into either five-premise
MEs or ten-premise MEs.
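
As a quick arithmetic check on this layout, the sketch below counts the matching exercises and response options on a thirty-premise form for each ME length. The helper name `me_layout` is hypothetical; the premise and response counts are the ones stated in the text.

```python
from typing import Tuple

def me_layout(total_premises: int, premises_per_me: int,
              responses_per_me: int) -> Tuple[int, int]:
    """Return (number of MEs, total response options) for one MT form."""
    if total_premises % premises_per_me != 0:
        raise ValueError("premises do not divide evenly into MEs")
    n_exercises = total_premises // premises_per_me
    return n_exercises, n_exercises * responses_per_me

# Five-premise MEs with six responses each
print(me_layout(30, 5, 6))    # (6, 36)
# Ten-premise MEs with twelve responses each
print(me_layout(30, 10, 12))  # (3, 36)
```

Both layouts give thirty-six response options per form, which agrees with the statement that all MTs consisted of thirty premises and thirty-six responses.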

The three classifications of cognitive achievement which the
MTs were constructed to measure were lower-order, higher-order,
and a combination of lower-order and higher-order achievement.



Subjects

The Ss were one hundred and ninety-six undergraduate students
enrolled in an introductory educational psychology course offered
by the Department of Educational Psychology at the University of
Pittsburgh. The majority of the Ss were female. The MTs were
administered as pre-tests for the final instructional unit of
the course. The scores were not included in the Ss' course grades.

Procedure

The Ss were assembled in a small auditorium. Twelve forms
of the MTs (two forms of MT instructions X two ME lengths X three
levels of cognitive achievement) were randomly assigned to the
Ss. The Ss were advised that certain of the MTs were different
in regard to the instructions and the ME lengths. Testing was
completed within the hour normally scheduled for the course.
Because of typographical errors, one premise from six of the MT
forms and three premises from two of the MT forms were not
included in the data analysis.
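
The twelve forms follow from crossing the three construction factors. The sketch below enumerates them using the cell labels of Table 1 (A = instructions, B = ME length, C = achievement level). The assignment function is purely an assumption for illustration: the paper does not describe how the random assignment was carried out, and the seed and function names are hypothetical.

```python
import random
from itertools import product

A_LEVELS = {1: "unrestricted instructions", 2: "restricted instructions"}
B_LEVELS = {1: "five-premise MEs", 2: "ten-premise MEs"}
C_LEVELS = {1: "lower-level", 2: "composite", 3: "higher-level"}

def form_codes():
    """List the 2 x 2 x 3 = 12 form labels, with A varying fastest."""
    return [f"ABC {a}{b}{c}"
            for c, b, a in product(C_LEVELS, B_LEVELS, A_LEVELS)]

def assign_forms(n_subjects, seed=None):
    """Assumed mechanism: each subject independently draws one form."""
    rng = random.Random(seed)
    codes = form_codes()
    return [rng.choice(codes) for _ in range(n_subjects)]

codes = form_codes()
print(len(codes))   # 12
print(codes[:4])    # ['ABC 111', 'ABC 211', 'ABC 121', 'ABC 221']
```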

Design 

A 2 X 2 X 3 full rank design was employed. The data were
analyzed using the univariate procedure described by Timm and
Carlson (1973). Achievement was the primary dependent variable.
Although subtest scores were computed for grading purposes, only
the total scores were included in the data analysis.

RESULTS

Percentage scores were computed for the MTs. Means, standard
deviations, and ranges (by treatment groups) are included in
Table 1.



TABLE 1

Distribution by Treatment Group (Cell) of
Means, Standard Deviations and Ranges

Cell                                      Standard
(Group)     N (Ss)   N' (Items)   Mean    Deviation*   Range

ABC 111        4         29       81.90      7.11      75.86 -  89.66
ABC 211       19         29       68.97     27.34       6.90 -  96.55
ABC 121       17         29       65.31     10.73      48.28 -  86.21
ABC 221       10         29       62.07     19.71      27.59 -  89.66
ABC 112       25         27       79.26     10.69      44.44 - 100.00
ABC 212       17         27       76.69     10.?4      59.26 -  92.59
ABC 122       18         29       67.43     13.34      41.38 -  89.66
ABC 222       10         29       71.03     11.86      51.72 -  89.66
ABC 113       14         30       84.05      7.53      70.00 -  96.67
ABC 213       31         30       83.23     11.59      40.00 - 100.00
ABC 123       12         30       81.11      7.70      70.00 -  93.33
ABC 223       19         30       73.68     15.63      20.00 -  93.33

A - MT Instructions:    Level 1 - Unrestricted
                        Level 2 - Restricted
B - ME Length:          Level 1 - Five-Premise MEs
                        Level 2 - Ten-Premise MEs
C - Achievement Levels: Level 1 - Lower-Level
                        Level 2 - Composite
                        Level 3 - Higher-Level

* These unbiased estimates were not adjusted. Adjustments
may be made by multiplying them by (100/N'), with N' being the
appropriate number of items.
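
The factor means reported in the text can be recovered from Table 1 as cell means weighted by cell size. The sketch below is an illustrative check, not part of the original analysis. Three cell means (ABC 211, ABC 122, ABC 123) are partly illegible in the scanned table; the values used here (68.97, 67.43, 81.11) are assumed readings, chosen because they agree with the legible digits and reproduce the reported factor means.

```python
# (A, B, C) -> (number of Ss, mean percentage score), as read from Table 1.
# The means for cells 211, 122 and 123 are assumed readings.
CELLS = {
    (1, 1, 1): (4, 81.90),  (2, 1, 1): (19, 68.97),
    (1, 2, 1): (17, 65.31), (2, 2, 1): (10, 62.07),
    (1, 1, 2): (25, 79.26), (2, 1, 2): (17, 76.69),
    (1, 2, 2): (18, 67.43), (2, 2, 2): (10, 71.03),
    (1, 1, 3): (14, 84.05), (2, 1, 3): (31, 83.23),
    (1, 2, 3): (12, 81.11), (2, 2, 3): (19, 73.68),
}

def weighted_mean(factor_index, level):
    """Mean over all cells at one level of a factor, weighted by cell size.

    factor_index: 0 for A, 1 for B, 2 for C.
    """
    chosen = [(n, m) for key, (n, m) in CELLS.items()
              if key[factor_index] == level]
    total_n = sum(n for n, _ in chosen)
    return sum(n * m for n, m in chosen) / total_n

print(sum(n for n, _ in CELLS.values()))                     # 196
print([round(weighted_mean(1, lv), 2) for lv in (1, 2)])     # factor B
print([round(weighted_mean(2, lv), 2) for lv in (1, 2, 3)])  # factor C
```

Under these readings the weighted means come out at about 78.91 and 70.10 for the two ME lengths and about 67.38, 74.42, and 80.66 for the three achievement levels.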



The F-ratios computed for both the B factor (ME length) and
the C factor (achievement levels) were significant well beyond the
.01 level. The analysis of variance summary table is presented
below (Table 2). The means observed for factor B were 78.91 and
70.10, respectively. The mean observed for five-premise MEs was
higher than the mean observed for ten-premise MEs.

TABLE 2

Results of ANOVA for Relationship among MT Instructions, ME
Length, Achievement Levels and Achievement

Source                    df         SS        F-Ratio

MT Instructions (A)        1       386.63        2.20
ME Length (B)              1      3473.70       19.76*
Achievement Levels (C)     2      2724.71        7.75*
A X B                      1        26.59         .15
A X C                      2       290.81         .83
B X C                      2       328.53         .93
A X B X C                  2       390.36        1.11
Within                   184     32340.67      175.76 (mean square)

*p<.01
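
The F-ratios in Table 2 can be checked from the sums of squares and degrees of freedom, since each F is the effect mean square divided by the within-cells mean square. The sketch below is an illustrative recomputation only. Two entries are partly garbled in the scan; the readings used here (328.53 for B X C and 32340.67 for Within) are assumptions that agree with the printed F-ratio and mean square.

```python
# Effect -> (df, SS) as read from Table 2; the B x C sum of squares and
# the within-cells sum of squares are assumed readings.
EFFECTS = {
    "A": (1, 386.63),
    "B": (1, 3473.70),
    "C": (2, 2724.71),
    "A x B": (1, 26.59),
    "A x C": (2, 290.81),
    "B x C": (2, 328.53),
    "A x B x C": (2, 390.36),
}
DF_WITHIN, SS_WITHIN = 184, 32340.67

def f_ratio(ss, df, ms_error):
    """F = (SS_effect / df_effect) / MS_error."""
    return (ss / df) / ms_error

ms_within = SS_WITHIN / DF_WITHIN
print(round(ms_within, 2))
for name, (df, ss) in EFFECTS.items():
    print(name, round(f_ratio(ss, df, ms_within), 2))
```

With these inputs the within-cells mean square is about 175.76, and the recomputed ratios agree with the printed column to two decimals.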

The means observed for factor C were 67.38 for lower-level
achievement, 74.42 for composite, and 80.66 for higher-level
achievement. A systematic increase in scores across achievement
levels was indicated.

The F-ratio computed for the BC interaction served also to
suggest the absence of a significant interaction. Thus,
interpretation of the main effects, B and C, was simplified. The
systematic increase of means observed for factor C (achievement
levels) was observed for both levels of B (ME length). This
increase was in the order of lower-level, composite, and
higher-level achievement, respectively. Expressed in terms of factor B,
means observed for five-premise MEs were higher than the means
observed for ten-premise MEs, across the three levels of factor C.

DISCUSSION

The findings of the present investigation indicate the
following conclusions:

1. The experimental hypothesis that unrestricted MT instructions
would result in lower mean scores than restricted instructions
was not supported.

2. The experimental hypothesis that ten-premise MEs would result
in lower mean scores than five-premise MEs was supported.

3. The experimental hypothesis that mean scores would decrease
as achievement level was increased was not supported, although
the reverse situation was supported.

4. The experimental hypothesis that there would be no significant
interactions among the three main factors was supported.

From the findings it is recommended that the number of premises
to be included within a matching exercise be held to approximately
five premises, with a greater number of response options to reduce
guessing. Increasing matching exercise length places greater
demands on testees' skills which may not be of immediate concern,
such as reading comprehension, attentiveness, and organization.
The practice of test toughening by increasing the length of the
matching exercises will probably reduce reliability and
interpretability of the scores. The absence of significant interactions
suggests that the above recommendation may reasonably apply for
matching tests designed to measure either knowledge or some
higher-order achievement level such as comprehension or synthesis.
This is good because most matching tests constructed for classroom
purposes measure a mixture of lower-order and higher-order
achievement.

Also, it is suggested that matching tests have an advantage
over multiple-choice tests in the assessment of partial knowledge.
While constructing the matching tests, the author had an
opportunity to make subjective comparisons with the multiple-choice
test items based upon the same material. Multiple-choice tests
consist of items which include one stem and a specified number of
responses. Matching tests consist of matching exercises which
include a response for each premise. The operation of partial
knowledge is present with both types of tests. However, a
four-option multiple-choice test item would normally fail to assess
testees' knowledge of three of the four options, whereas a
four-premise matching exercise (with five response options) may fail
to assess knowledge of only one distractor. It seems reasonable
that testees would have a greater opportunity to demonstrate their
total knowledge on matching tests rather than on multiple-choice
tests.
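
The comparison in this paragraph comes down to counting the response options that are never keyed as a correct answer. The sketch below is a minimal illustration of that count. It assumes, as the paragraph does, that each stem or premise keys exactly one distinct option; the function name is hypothetical.

```python
def unassessed_options(n_options: int, n_premises: int) -> int:
    """Options that are never a correct answer (pure distractors).

    Assumes each premise (or stem) keys exactly one distinct option.
    """
    if n_premises > n_options:
        raise ValueError("more premises than response options")
    return n_options - n_premises

# Four-option multiple-choice item: one stem, so three options go unassessed
print(unassessed_options(n_options=4, n_premises=1))  # 3
# Four-premise matching exercise with five options: one distractor
print(unassessed_options(n_options=5, n_premises=4))  # 1
```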

The matching tests became easier as the achievement levels
were increased. The explanation may follow from the fact that
higher institutions prepare students to become the thinkers, rather
than the memorizers, of society. In time, students learn to adjust
to this fact. The author feels that this is the major reason why
the students who took the matching tests were better prepared to
demonstrate their understanding of the course subject matter,
rather than to demonstrate their ability to recall details.